OODB Bulk Loading Revisited: The Partitioned-List Approach

Authors

  • Janet L. Wiener
  • Jeffrey F. Naughton
Abstract

Object-oriented and object-relational databases (OODB) need to be able to load the vast quantities of data that OODB users bring to them. Loading OODB data is significantly more complicated than loading relational data due to the presence of relationships, or references, in the data; the presence of these relationships means that naive loading algorithms are slow to the point of being unusable. In our previous work, we presented the late-invsort algorithm, which performed significantly better than naive algorithms on all the data sets we tested. Unfortunately, further experimentation with the late-invsort algorithm revealed that for large data sets (ones in which a critical data structure of the load algorithm does not fit in memory), the performance of late-invsort rapidly degrades to the point where it, too, is unusable. In this paper we propose a new algorithm, the partitioned-list algorithm, whose performance almost matches that of late-invsort for smaller data sets but does not degrade for large data sets. We present a performance study of an implementation within the Shore persistent object repository showing that the partitioned-list algorithm is at least an order of magnitude better than previous algorithms on large data sets. In addition, because loading gigabytes and terabytes of data can take hours, we describe how to checkpoint the partitioned-list algorithm and resume a long-running load after a system crash or other interruption.
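
The abstract does not spell out how the partitioned-list algorithm works, so the following is only a rough sketch of the general idea it alludes to: when the structure used to resolve references does not fit in memory, both that structure and the pending references can be split into partitions by the same hash, so that only one partition's lookup table is in memory at a time. The names `partition`, `resolve_references`, `id_map_entries`, and `pending_refs` are hypothetical and chosen for illustration; in-memory lists stand in for on-disk partition files, and this is not the authors' implementation.

```python
from collections import defaultdict


def partition(records, key_index, num_partitions):
    """Split records into buckets by hashing the field at key_index."""
    buckets = defaultdict(list)
    for rec in records:
        buckets[hash(rec[key_index]) % num_partitions].append(rec)
    return buckets


def resolve_references(id_map_entries, pending_refs, num_partitions=4):
    """Turn (source_oid, target_surrogate) pairs into (source_oid, target_oid).

    id_map_entries: iterable of (surrogate, oid) pairs (hypothetical layout)
    pending_refs:   iterable of (source_oid, target_surrogate) pairs
    """
    # Both inputs are partitioned on the surrogate, so matching entries
    # always land in the same partition number.
    id_parts = partition(id_map_entries, 0, num_partitions)
    ref_parts = partition(pending_refs, 1, num_partitions)

    resolved = []
    for p in range(num_partitions):
        # Only this partition's slice of the id map is held in a dict.
        lookup = dict(id_parts.get(p, []))
        for source_oid, surrogate in ref_parts.get(p, []):
            resolved.append((source_oid, lookup[surrogate]))
    return resolved


if __name__ == "__main__":
    id_map = [("a", 101), ("b", 102), ("c", 103)]
    refs = [(101, "b"), (102, "c"), (103, "a"), (101, "c")]
    for pair in sorted(resolve_references(id_map, refs, num_partitions=2)):
        print(pair)
```

Because each partition is processed independently, a real loader could record which partitions are finished and restart from the next one after a crash; whether the paper's checkpointing works this way is an assumption here.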

Similar resources

Bulk Loading into an OODB: A Performance Study

Object-oriented database (OODB) users bring with them large quantities of legacy data (megabytes and even gigabytes). In addition, scientific OODB users continually generate new data. All this data must be loaded into the OODB. Every relational database system has a load utility, but most OODBs do not. The process of loading data into an OODB is complicated by inter-object references, or relatio...

An Evaluation of Vertical Class Partitioning for Query Processing in Object-Oriented Databases

Vertical partitioning is a design technique for reducing the number of disk accesses to execute a given set of queries by minimizing the number of irrelevant instance variables accessed. This is accomplished by grouping the frequently accessed instance variables as vertical class fragments. The complexity of object-oriented database models due to subclass hierarchy and class composition hierar...

Evaluation of Effectiveness of Main Factors on the Reduction of Loading and Discharging Performance Versus Loading and Discharging Rate of Dry Bulk Terminal (Case Study of Imam Khomeini Port)

The aim of this article is to measure the impact of the main factors that reduce discharge and loading performance relative to the discharge and loading rate at the dry bulk terminal of Imam Khomeini Port. For this purpose, the port's actual discharging and loading statistics and documented library data were used. In order to answer the research questions, multip...

A Vertical Partitioning Algorithm for Distributed Object Oriented Databases

Object-oriented databases (OODBs) are becoming more popular and are used in a large number of application domains. To support homogeneous distributed OODBs, a clear understanding of class partitioning, and of how to carry it out with different partitioning algorithms, is needed. In this paper an algorithm for vertical fragmentation in a model consisting of class and comprising of compl...

Modeling and Simulating a Software Architecture Design Space

Frequently, a similar type of software system is used in the implementation of many different software applications. Databases are an example. Two software development approaches are common to fill the need for instances from a class of similar systems: (1) repeated custom development of similar instances, one for each different application, or (2) development of one or more general purpose off-...

Publication date: 1995